Closed Bug 1823412 Opened 2 years ago Closed 2 years ago

gpu process pegged with 97% usage loading PremierOpinion's pmls64.dll

Categories

(Core :: Graphics, defect)

defect

Tracking

()

RESOLVED FIXED
113 Branch
Tracking Status
firefox113 --- fixed

People

(Reporter: chutten, Assigned: yannis)

References

Details

(Whiteboard: [win:stability])

Attachments

(1 file)

User reported severe lag in Firefox. Instructed them to check about:processes, used it to track it to the GPU process, and used the built in "profile all threads for 5s" button (kudos to all teams involved in making this so very straightforward to do) to gather https://share.firefox.dev/42lFKgI which shows a whole lotta DllLoad for pmls64.dll which is connected to "PremierOpinion".

Anything we can do about this? The user's taking actions to scrub the unwanted third-party software... but could we have prevented it from causing such grief in Firefox?

See Also: → 1809519
Whiteboard: [win:stability]

Am I reading the profile wrong? It looks like the DllLoad calls are taking on the order of 250us, while the NotifyObservers calls are taking on the order of 1us. So while I agree that we could probably do something more sophisticated about notifying, it doesn't seem like that would help very much in this case.

We have seen issues like this before when we add a DLL that gets loaded by a window hook to the DLL blocklist, and our advice is...don't block those DLLs, which isn't helpful in this case since we're not doing the blocking.

I may be misreading this as well, not a profiler expert, but it also looks to me like most of the time we're in NotifyObservers is being spent on recording data for the profile itself, code that would be skipped when a profile isn't being collected. So I do think the observer notifications might be a red herring here.

Assignee: nobody → yjuglaret
No longer blocks: gfx-triage

Thanks Greg and Molly for your comments, they were helpful. Indeed my first comment was incorrect.

I managed to reproduce exactly the profile shared in the description, as well as the severe lag. It happens when a DLL that is registered with SetWindowsHookExW for WH_GETMESSAGE gets blocked by our blocklist. I managed to reproduce the profile by registering my own DLL and blocking it in about:third-party, then I realized that the story is a little bit different for pmls64.dll...

(In reply to Greg Stoll from comment #2)

We have seen issues like this before when we add a DLL that gets loaded by a window hook to the DLL blocklist, and our advice is...don't block those DLLs, which isn't helpful in this case since we're not doing the blocking.

... In fact we are doing the blocking! This DLL is hard-blocked in our static blocklist.

First let me clarify what we see in the profile now that I understand it better. For a start, the raw addresses ending in 74 and b4 without symbols correspond to the following stacks:

 # Child-SP          RetAddr               Call Site
00 0000001f`e6ffdb78 00007ff6`2bdb03d6     0x00000216`7a8c2c74 (orig_OpenFile)
01 0000001f`e6ffdb80 00007ff6`2bdb5125     firefox!TargetNtOpenFile+0x56 [/builds/worker/checkouts/gecko/security/sandbox/chromium/sandbox/win/src/filesystem_interception.cc @ 125] 
02 0000001f`e6ffdcf0 00007ff9`3c46064c     firefox!TargetNtOpenFile64+0x35 [/builds/worker/checkouts/gecko/security/sandbox/chromium/sandbox/win/src/interceptors_64.cc @ 120] 
03 0000001f`e6ffdd30 00007ff9`3c4610e0     ntdll!LdrpMapDllNtFileName+0xe8
04 0000001f`e6ffde30 00007ff9`3c460eaf     ntdll!LdrpMapDllFullPath+0xe0
05 0000001f`e6ffdfc0 00007ff9`3c4788c4     ntdll!LdrpProcessWork+0x77
06 0000001f`e6ffe010 00007ff9`3c468c9c     ntdll!LdrpLoadDllInternal+0x1a0
07 0000001f`e6ffe0b0 00007ff9`3c47a24a     ntdll!LdrpLoadDll+0xb0
08 0000001f`e6ffe270 00007ff6`2bd98803     ntdll!LdrLoadDll+0xfa
09 (Inline Function) --------`--------     firefox!mozilla::interceptor::FuncHookCrossProcess<mozilla::interceptor::WindowsDllInterceptor<mozilla::interceptor::VMSharingPolicyUnique<mozilla::interceptor::MMPolicyOutOfProcess> >,long (*)(wchar_t *, unsigned long *, _UNICODE_STRING *, void **)>::operator()+0x19 [/builds/worker/workspace/obj-build/dist/include/nsWindowsDllInterceptor.h @ 254] 
0a 0000001f`e6ffe360 00007ff9`39bbbb92     firefox!mozilla::freestanding::patched_LdrLoadDll+0x1b3 [/builds/worker/checkouts/gecko/browser/app/winlauncher/freestanding/DllBlocklist.cpp @ 360] 
0b 0000001f`e6ffe440 00007ff9`3a51e067     KERNELBASE!LoadLibraryExW+0x172
0c 0000001f`e6ffe4b0 00007ff9`3c4f2c94     user32!_ClientLoadLibrary+0xa7
0d 0000001f`e6ffe650 00007ff9`39f314d4     ntdll!KiUserCallbackDispatcherContinue
0e 0000001f`e6ffe768 00007ff9`3a531bdf     win32u!NtUserPeekMessage+0x14
0f 0000001f`e6ffe770 00007ff9`3a531aac     user32!_PeekMessage+0x3f
10 0000001f`e6ffe7e0 00007ff8`63d37d4c     user32!PeekMessageW+0x9c
...

 # Child-SP          RetAddr               Call Site
00 0000001f`e6ffdb98 00007ff6`2bdb0885     0x00000216`7a8c2cb4 (orig_QueryAttributes)
01 0000001f`e6ffdba0 00007ff9`3c47ac39     firefox!TargetNtQueryAttributesFile+0x35 [/builds/worker/checkouts/gecko/security/sandbox/chromium/sandbox/win/src/filesystem_interception.cc @ 204] 
02 0000001f`e6ffdce0 00007ff9`3c47a3f7     ntdll!LdrpGetNtPathFromDosPath+0x95
03 0000001f`e6ffddc0 00007ff9`3c461072     ntdll!LdrpResolveDllName+0x103
04 0000001f`e6ffde30 00007ff9`3c460eaf     ntdll!LdrpMapDllFullPath+0x72
05 0000001f`e6ffdfc0 00007ff9`3c4788c4     ntdll!LdrpProcessWork+0x77
06 0000001f`e6ffe010 00007ff9`3c468c9c     ntdll!LdrpLoadDllInternal+0x1a0
07 0000001f`e6ffe0b0 00007ff9`3c47a24a     ntdll!LdrpLoadDll+0xb0
08 0000001f`e6ffe270 00007ff6`2bd98803     ntdll!LdrLoadDll+0xfa
09 (Inline Function) --------`--------     firefox!mozilla::interceptor::FuncHookCrossProcess<mozilla::interceptor::WindowsDllInterceptor<mozilla::interceptor::VMSharingPolicyUnique<mozilla::interceptor::MMPolicyOutOfProcess> >,long (*)(wchar_t *, unsigned long *, _UNICODE_STRING *, void **)>::operator()+0x19 [/builds/worker/workspace/obj-build/dist/include/nsWindowsDllInterceptor.h @ 254] 
0a 0000001f`e6ffe360 00007ff9`39bbbb92     firefox!mozilla::freestanding::patched_LdrLoadDll+0x1b3 [/builds/worker/checkouts/gecko/browser/app/winlauncher/freestanding/DllBlocklist.cpp @ 360] 
0b 0000001f`e6ffe440 00007ff9`3a51e067     KERNELBASE!LoadLibraryExW+0x172
0c 0000001f`e6ffe4b0 00007ff9`3c4f2c94     user32!_ClientLoadLibrary+0xa7
0d 0000001f`e6ffe650 00007ff9`39f314d4     ntdll!KiUserCallbackDispatcherContinue
0e 0000001f`e6ffe768 00007ff9`3a531bdf     win32u!NtUserPeekMessage+0x14
0f 0000001f`e6ffe770 00007ff9`3a531aac     user32!_PeekMessage+0x3f
10 0000001f`e6ffe7e0 00007ff8`63d37d4c     user32!PeekMessageW+0x9c
...

Now, why do we not have full stacks in the profile for orig_OpenFile, orig_QueryAttributes, NtClose, NtCreateSection, ZwMapViewOfSection, ZwQueryVirtualMemory, ZwUnmapViewOfSection? Because they all happen in LdrLoadDll, and before calling into LdrLoadDll we disable stack walking over concerns about the function table locks, and we restore it only after the call. So even though these calls seem to appear out of nowhere in the profile, they should be understood as having the same call stack as what I just shared above.

When we understand that, it confirms that the process spends ~100% of its time in nsThread::ProcessNextEvent. In fact, we are facing an infinite cycle of events, where the code that we execute to process an event will itself generate a new DLL load event that needs to be processed, and again, leaving the process no time to rest. Here is how the cycle appears:

  • a first legitimate native event arrives;
  • we reach nsThread::ProcessNextEvent to look for an event;
  • that calls into nsBaseAppShell::OnProcessNextEvent, which calls into PeekMessageW;
  • there is an event, and pmls64.dll is not loaded yet, so we reach _ClientLoadLibrary to load pmls64.dll where the registered hook lives;
  • that calls patched_LdrLoadDll, which will fail to load the DLL because it is blocked, then will call into DllServices::DispatchDllLoadNotification to register a new DLL load event for the main thread, without actually posting it for the moment;
  • since loading the DLL failed, we handle the original event without applying the hook, then we return from PeekMessageW;
  • the next call to nsThread::ProcessNextEvent will reach EnsureMainThreadTasksScheduled, where the DLL load event will be posted with PostMessage as a new legitimate native event, right before going into nsBaseAppShell::OnProcessNextEvent, thus starting a new iteration of the cycle...

Note that we only have a cycle because of blocking. If we do not block the DLL, we avoid the cycle:

  • if the DLL loads successfully, further calls into PeekMessageW will not call into _ClientLoadLibrary for the same library anymore, they will call the hook directly;
  • if the DLL fails to load for another reason than blocking (e.g. DllMain returns FALSE on DLL_PROCESS_ATTACH), we do not generate any DLL load event.

The cycle described above exists even when the profiler is not attached.

(In reply to Yannis Juglaret [:yannis] from comment #4)

Note that we only have a cycle because of blocking. If we do not block the DLL, we avoid the cycle:

  • if the DLL loads successfully, further calls into PeekMessageW will not call into _ClientLoadLibrary for the same library anymore, they will call the hook directly;
  • if the DLL fails to load for another reason than blocking (e.g. DllMain returns FALSE on DLL_PROCESS_ATTACH), we do not generate any DLL load event.

[:gcp] noticed that by blaming that code, we can see that the possibility to register events about failure to load a blocked DLL was added in bug 1686229. It allows us to log information in third-party module pings about which DLLs have been actively blocked. We think we should be able to break the cycle by sending at most one event per blocked DLL.

I created a Github repository to share the reproducer.

Steps to reproduce:

PS > Get-FileHash .\HookTests.exe

Algorithm       Hash                                                                   Path
---------       ----                                                                   ----
SHA256          5EBEDD581258A1C29F4883B8B443C2B4EF86F442C0D53954AA6E4A8409DAB14D       .\HookTests.exe

PS > Get-FileHash .\HookLib.dll

Algorithm       Hash                                                                   Path
---------       ----                                                                   ----
SHA256          CC4509C7D621B0F69584FF7FB082C9DC63C178133D03ED362C713DAF065D7C56       .\HookLib.dll
  • Double click on HookTests.exe. SmartScreen prevents you from running the executable. Click More info, Run anyway.
  • HookTests.exe launches and should show: SUCCESS: The hook is installed.
  • Open Firefox and navigate to about:third-party. You should see HookLib.dll listed.
  • In Firefox, click Reload with system info. A new button Block this module appears near HookLib.dll. Click it. Click Restart Firefox now.
  • When Firefox restarts, it should be extremely slow. If you can, navigate to about:processes, it should show that the main process and the GPU process together use around 100% of CPU.
  • Back in HookTests.exe, press ENTER, it should show: SUCCESS: The hook is uninstalled. Press ENTER again.
  • Back in Firefox, you should now experience normal performance.
  • Navigate to about:third-party. Click Reload with system info. Click the Currently blocked button near HookLib.dll to unblock the module. Click Restart Firefox now.

Expected result: Once this bug is fixed, performing these steps should not result in bad performance at any point of Firefox use. The main process and the GPU process should not use 100% CPU together but have normal CPU usage.

Has STR: --- → yes

I was able to reproduce the issue following the STR from comment 6 in a local build. Applying the following patch removes the issue, suggesting that the analysis from comment 4 is correct:

diff --git a/toolkit/xre/dllservices/mozglue/LoaderObserver.cpp b/toolkit/xre/dllservices/mozglue/LoaderObserver.cpp
--- a/toolkit/xre/dllservices/mozglue/LoaderObserver.cpp
+++ b/toolkit/xre/dllservices/mozglue/LoaderObserver.cpp
@@ -58,9 +58,7 @@ void LoaderObserver::OnEndDllLoad(void*
             utf8RequestedDllName.get()));
   }

-  // We want to record a denied DLL load regardless of |aNtStatus| because
-  // |aNtStatus| is set to access-denied when DLL load was blocked.
-  if ((!NT_SUCCESS(aNtStatus) && !aModuleLoadInfo.WasDenied()) ||
+  if (!NT_SUCCESS(aNtStatus) ||
       !aModuleLoadInfo.WasMapped()) {
     return;
   }

Of course this patch would prevent logging blocked modules in third-party module pings, so this is not the patch we want to apply to solve this issue.

Attachment #9325703 - Attachment description: Bug 1823412 - Notify blocked DLL loads only once per blocked DLL in LoaderObserver. r=mhowell,gstoll → Bug 1823412 - Notify main thread blocked DLL loads only once per DLL in WindowsDllServices. r=mhowell,gstoll

Here are profiles captured from a local build while the hook is active, following the STR from comment 6, before and after the proposed patch:

  • before patch: GPU and parent processes are flooded with DLLLoad events for HookLib.dll and consuming a lot of CPU, GPU process main thread is active 100% of the time, parent process main thread is active 100% of the time too except when it is waiting for the GPU process to do something;
  • after patch: the DLLLoad events for HookLib.dll correlate with DOMEvent (parent process) and ReceiveMouseInputEvent (GPU), CPU usage is back to normal.
See Also: → 1747980
Duplicate of this bug: 1747980
Pushed by yjuglaret@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/d79bd97efa79 Notify main thread blocked DLL loads only once per DLL in WindowsDllServices. r=mhowell,gstoll
Regressions: 1826093
See Also: → 1826157
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 113 Branch
See Also: → 1842541
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: